Multifactor Fusion for Audio-Visual Speaker Recognition

نویسنده

  • GIRIJA CHETTY
چکیده

In this paper we propose a multifactor hybrid fusion approach for enhancing security in audio-visual speaker verification. Speaker verification experiments conducted on two audiovisual databases, VidTIMIT and UCBN, show that multifactor hybrid fusion involve a combination feature-level fusion of lip-voice features and face-lip-voice features at score-level is indeed a powerful technique for speaker identity verification, as it preserves synchronisation . of the closely coupled modalities, such as face, voice and lip dynamics of a speaker during speech, through various stages of authentication. An improvement in error rate of the order of 22-36% is achieved for experiments by using feature level fusion of acoustic and visual feature vectors from lip region as compared to classical late fusion approach. Key-Words: Audio-visual, Multifactor, Hybrid Fusion, Speaker recognition, Impostor attacks

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

An Examination of Audio-visual Fused Hmms for Speaker Recognition

Fused hidden Markov models (FHMMs) have been shown to work well for the task of audio-visual speaker recognition, but only in an output decision-fusion configuration of both the audioand video-biased versions of the FHMM structure. This paper looks at the performance of the audioand video-biased versions independently, and shows that the audio-biased version is considerably more capable for spe...

متن کامل

Weight Estimation for Audio-Visual Multi-level Fusion in Bimodal Speaker Identification

This paper investigates the estimation of fusion weights under varying acoustic noise conditions for audio-visual multi-level hybrid fusion strategy in speaker identification. The multi-level fusion combines model level and decision level fusion via dynamic Bayesian networks (DBNs). A novel methodology known as support vector regression (SVR) is utilized to estimate the fusion weights directly ...

متن کامل

Audio-visual multilevel fusion for speech and speaker recognition

In this paper we propose a robust audio-visual speech-andspeaker recognition system with liveness checks based on audio-visual fusion of audio-lip motion and depth features. The liveness verification feature added here guards the system against advanced spoofing attempts such as manufactured or replayed videos. For visual features, a new tensor-based representation of lip motion features, extra...

متن کامل

A New Approach to Integrate Audio and Visual Features of Speech

This paper presents a novel fused-hidden Markov model (fused-HMM) to integrate the audio and visual features of speech. In this model, audio and visual HMMs built individually are fused together using a general probabilistic fusion method, which is optimal in the maximum entropy sense. Specifically, the fusion method uses the dependencies between the audio hidden states and the visual observati...

متن کامل

Multi-level Fusion of Audio and Visual Features for Speaker Identification

This paper explores the fusion of audio and visual evidences through a multi-level hybrid fusion architecture based on dynamic Bayesian network (DBN), which combines model level and decision level fusion to achieve higher performance. In model level fusion, a new audio-visual correlative model (AVCM) based on DBN is proposed, which describes both the intercorrelations and loose timing synchroni...

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

عنوان ژورنال:

دوره   شماره 

صفحات  -

تاریخ انتشار 2007